JARQUE_BERA

Overview

The JARQUE_BERA function tests whether the skewness and kurtosis of sample data are consistent with a normal distribution. Named after economists Carlos Jarque and Anil K. Bera, who developed it in 1980, the test is widely used in econometrics and financial analysis to check the normality assumptions that many statistical models rely on.

The Jarque-Bera test examines two key properties of the data distribution: skewness (asymmetry) and kurtosis (tail heaviness). For a normal distribution, the expected skewness is 0 and the expected excess kurtosis is also 0 (equivalent to a kurtosis of 3). The test statistic quantifies how far the sample deviates from these expected values.

The test statistic JB is calculated as:

JB = \frac{n}{6}\left(S^2 + \frac{1}{4}(K-3)^2\right)

where n is the sample size, S is the sample skewness, and K is the sample kurtosis. SciPy computes both S and K from biased (divide-by-n) central moments. The term (K-3) is the excess kurtosis, the deviation from the normal distribution’s kurtosis of 3.
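The formula can be checked by hand with a few lines of standard-library Python. This is a minimal sketch, not the code the function itself runs: the helper name jb_statistic is hypothetical, and it assumes biased (divide-by-n) central moments.

```python
def jb_statistic(values):
    """Compute the JB statistic from biased (divide-by-n) central moments."""
    n = len(values)
    mean = sum(values) / n
    # Second, third, and fourth central moments with 1/n normalisation.
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    skew = m3 / m2 ** 1.5   # S in the formula
    kurt = m4 / m2 ** 2     # K in the formula (not excess kurtosis)
    return n / 6 * (skew ** 2 + 0.25 * (kurt - 3) ** 2)

print(round(jb_statistic([1, 2, 3, 4, 5, 6]), 4))  # 0.4023
```

For the values 1 through 6 the skewness is 0 and the kurtosis is about 1.73, so the statistic comes entirely from the kurtosis term.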

Under the null hypothesis that the data come from a normal distribution, the JB statistic asymptotically follows a chi-squared distribution with 2 degrees of freedom. JB is never negative, and larger values indicate a stronger departure from normality. The function returns both the test statistic and the p-value; a small p-value (typically < 0.05) suggests rejecting the null hypothesis of normality.
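Because a chi-squared distribution with 2 degrees of freedom has the closed-form survival function exp(-x/2), the asymptotic p-value can be reproduced without any statistics library. The sketch below uses a hypothetical helper name, jb_p_value, and is meant only to illustrate the relationship between the two returned numbers.

```python
import math

def jb_p_value(jb):
    """Asymptotic p-value: P(X >= jb) for X ~ chi-squared with 2 degrees of freedom."""
    return math.exp(-jb / 2)

print(round(jb_p_value(0.4023), 4))  # 0.8178
print(round(jb_p_value(3.464), 4))   # 0.1769
```

Inverting the same expression gives the 5% critical value, -2 ln(0.05), which is roughly 5.99: a JB statistic above that corresponds to p < 0.05 under the asymptotic approximation.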

This implementation uses SciPy’s jarque_bera function from the scipy.stats module. Because the chi-squared distribution is only an asymptotic approximation, the p-value is most reliable for large samples (SciPy’s documentation suggests more than 2000 observations); for smaller samples, the approximation may be overly sensitive and produce inflated Type I error rates. For more background, see the Wikipedia article on the Jarque-Bera test or the original paper by Jarque and Bera (1980).
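One common way to sidestep the asymptotic approximation for small samples is to simulate the null distribution of JB directly. The following is a rough sketch of that idea, not something the JARQUE_BERA function does: the names jb_statistic and monte_carlo_p_value, the number of simulations, and the seed are all assumptions chosen for illustration.

```python
import random

def jb_statistic(values):
    """JB statistic from biased (divide-by-n) central moments."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    return n / 6 * ((m3 / m2 ** 1.5) ** 2 + 0.25 * (m4 / m2 ** 2 - 3) ** 2)

def monte_carlo_p_value(values, n_sim=2000, seed=0):
    """Estimate the p-value by simulating normal samples of the same size."""
    rng = random.Random(seed)
    observed = jb_statistic(values)
    n = len(values)
    exceed = 0
    for _ in range(n_sim):
        # JB is unchanged by shifting or rescaling the data, so standard
        # normal draws are enough to represent the null hypothesis.
        sim = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if jb_statistic(sim) >= observed:
            exceed += 1
    # The add-one correction keeps the estimate strictly positive.
    return (exceed + 1) / (n_sim + 1)

print(monte_carlo_p_value([0, 0.1, 0.2, 0.3, 0.4, 5]))
```

The estimate depends on the seed and on n_sim, so it should be read as an approximation with its own sampling error.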

This example function is provided as-is without any representation of accuracy.

Excel Usage

=JARQUE_BERA(data)
  • data (list[list], required): Sample data to test for normality. Must contain at least two numeric values.

Returns (list[list]): 2D list [[statistic, p_value]], or error message string.

Examples

Example 1: Symmetric data centered near zero

Inputs:

data
0.1
-0.2
0.3
0
0.2
-0.1

Excel formula:

=JARQUE_BERA({0.1;-0.2;0.3;0;0.2;-0.1})

Expected output:

Result
0.4023 0.8178

Example 2: Uniformly distributed data

Inputs:

data
1
2
3
4
5
6

Excel formula:

=JARQUE_BERA({1;2;3;4;5;6})

Expected output:

Result
0.4023 0.8178

Example 3: Evenly spaced data

Inputs:

data
2.5
2.7
2.9
3.1
3.3
3.5

Excel formula:

=JARQUE_BERA({2.5;2.7;2.9;3.1;3.3;3.5})

Expected output:

Result
0.4023 0.8178
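Examples 1 to 3 return identical results because each data set consists of six evenly spaced values, and JB depends on the data only through their standardized values (z-scores). The sketch below, with hypothetical helper names standardized and same_shape, checks that the three inputs share the same z-scores once sorted.

```python
import math

def standardized(values):
    """Return sorted z-scores using the biased (divide-by-n) standard deviation."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in values) / n)
    return sorted((x - mean) / sd for x in values)

def same_shape(a, b):
    """True when two samples have matching z-scores up to rounding error."""
    return all(
        math.isclose(x, y, abs_tol=1e-9)
        for x, y in zip(standardized(a), standardized(b))
    )

ex1 = [0.1, -0.2, 0.3, 0, 0.2, -0.1]
ex2 = [1, 2, 3, 4, 5, 6]
ex3 = [2.5, 2.7, 2.9, 3.1, 3.3, 3.5]

print(same_shape(ex1, ex2) and same_shape(ex2, ex3))  # True
```

Any shift or positive rescaling of a sample leaves its skewness and kurtosis unchanged, so the test cannot distinguish these three inputs.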

Example 4: Data with outlier

Inputs:

data
0
0.1
0.2
0.3
0.4
5

Excel formula:

=JARQUE_BERA({0;0.1;0.2;0.3;0.4;5})

Expected output:

Result
3.464 0.1769

Python Code

from scipy.stats import jarque_bera as scipy_jarque_bera
import math

def jarque_bera(data):
    """
    Perform the Jarque-Bera goodness of fit test for normality.

    See: https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.jarque_bera.html

    This example function is provided as-is without any representation of accuracy.

    Args:
        data (list[list]): Sample data to test for normality. Must contain at least two numeric values.

    Returns:
        list[list]: 2D list [[statistic, p_value]], or error message string.
    """
    def to2d(x):
        return [[x]] if not isinstance(x, list) else x

    def flatten(arr):
        result = []
        for row in arr:
            if isinstance(row, list):
                result.extend(row)
            else:
                result.append(row)
        return result

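    # Promote a scalar to a 1x1 grid so a single value is handled like a range.
    data = to2d(data)

    # Reject 1D lists and any other shape that is not a list of rows.
    if not isinstance(data, list) or not all(isinstance(row, list) for row in data):
        return "Invalid input: data must be a 2D list."

    # Convert every cell to float, rejecting non-numeric, NaN, and infinite values.
    flat = flatten(data)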
    values = []
    for val in flat:
        try:
            f = float(val)
            if math.isnan(f) or math.isinf(f):
                return "Invalid input: data must contain only numeric values."
            values.append(f)
        except (TypeError, ValueError):
            return "Invalid input: data must contain only numeric values."

    if len(values) < 2:
        return "Invalid input: data must contain at least two numeric values."

    try:
        result = scipy_jarque_bera(values)
        stat = float(result.statistic)
        pval = float(result.pvalue)
    except Exception as e:
        return f"Calculation error: {e}"

    if math.isnan(stat) or math.isinf(stat) or math.isnan(pval) or math.isinf(pval):
        return "Calculation error: result contains NaN or infinity."

    return [[stat, pval]]
